Course Information
Course Name
Machine Learning (機器學習)
Semester
112-1
Intended Audience
Graduate School of Advanced Technology, Ph.D. Program in Nanoengineering and Nanoscience
Instructor
舒貽忠
Course Number
AM7192
Course Identifier
543 M1180
Section

Credits
3.0
Full/Half Year
Half year (one semester)
Required/Elective
Elective
Class Time
Wednesday, period 6 (13:20~14:10); Friday, periods 7 and 8 (14:20~16:20)
Classroom
應113
Remarks
Enrollment limit: 60 students
 
Course Syllabus
Course Description

The course provides a comprehensive introduction to machine learning, with a primary emphasis on the fundamental principles governing learning algorithms. It covers a wide range of topics, including: (1) Supervised Learning: generative and discriminative probabilistic classifiers (Bayes and logistic regression), least-squares regression, and neural networks (convolutional neural networks, recurrent neural networks); (2) Probabilistic Graphical Models: the Hidden Markov Model (HMM); (3) Basic Learning Theory: PAC learning and model selection. This course aims to provide students with a robust foundation essential for conducting research in machine learning.

Course Objectives
Upon completion, students will be proficient in applying calculus, linear algebra, optimization, probability, and statistics to build learning models for diverse real-world problems. Moreover, they will be well prepared for advanced research in machine learning and related domains.
Course Requirements
The course is taught in Chinese, using the blackboard to present and explain the mathematical principles of machine learning algorithms.
Expected Weekly Study Hours Outside of Class
 
Office Hours
By appointment
Required Reading
To be announced
References
1. C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
2. S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
3. O. Calin. Deep Learning Architectures: A Mathematical Approach. Springer, 2020.
4. K. P. Murphy. Probabilistic Machine Learning: An Introduction. MIT Press, 2022.
5. Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.-T. Lin. Learning From Data. AMLbook, 2012.
6. E. Alpaydin. Introduction to Machine Learning. MIT Press, 2020.
Grading
(For reference only)
Course Schedule
Week 1 (9/06, 9/08): Mathematical formulation of a learning problem, evaluation of a model (loss function), generalization error, Empirical Risk Minimization (ERM), ERM with inductive bias, the Bayes optimal classifier
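
To make ERM concrete, here is a minimal Python sketch (illustrative only, not part of the official course materials): it minimizes the empirical 0-1 risk over a finite hypothesis class of one-dimensional threshold classifiers, on synthetic data with label noise.

# A minimal sketch of Empirical Risk Minimization (ERM) with 0-1 loss,
# over a finite hypothesis class of threshold classifiers h_t(x) = sign(x - t).
# All data below are synthetic illustrations, not course data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)               # 1-D inputs
y = np.where(x > 0.3, 1, -1)                   # labels from true threshold 0.3
y[rng.random(100) < 0.1] *= -1                 # 10% label noise

thresholds = np.linspace(-1, 1, 201)           # the finite hypothesis class

def empirical_risk(t):
    preds = np.where(x > t, 1, -1)
    return np.mean(preds != y)                 # 0-1 loss averaged over the sample

risks = np.array([empirical_risk(t) for t in thresholds])
t_erm = thresholds[np.argmin(risks)]           # ERM returns the empirical minimizer
print(f"ERM threshold: {t_erm:.2f}, empirical risk: {risks.min():.2f}")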
Week 2 (9/13, 9/15): Examples of Bayes optimal classifiers, polynomial threshold functions, overfitting, generalization/empirical error vs. model complexity
Week 3 (9/20, 9/22): An example illustrating the No-Free-Lunch Theorem; the Perceptron Learning Algorithm (PLA) for linearly separable data
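
A minimal sketch of PLA on synthetic linearly separable data (an illustration, not the instructor's code); since the data are separable, the perceptron convergence theorem guarantees the loop terminates.

# Perceptron Learning Algorithm on synthetic separable data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(50, 2))
w_true = np.array([1.0, -2.0, 0.5])            # ground-truth separator (bias first)
Xb = np.hstack([np.ones((50, 1)), X])          # prepend x0 = 1 for the bias term
y = np.sign(Xb @ w_true)                       # separable labels by construction

w = np.zeros(3)
while True:
    wrong = np.where(np.sign(Xb @ w) != y)[0]  # indices of misclassified points
    if wrong.size == 0:                        # all points correct: stop
        break
    i = wrong[0]                               # pick one misclassified point
    w = w + y[i] * Xb[i]                       # the PLA update rule
print("learned weights:", w)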
Week 4 (9/27, 9/29): Mean, standard deviation, the Bernoulli distribution, worked examples of Bayes' theorem
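
As one worked example of Bayes' theorem, the short sketch below uses made-up numbers for a diagnostic test (the prior, sensitivity, and specificity are all hypothetical) to compute a posterior probability.

# Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+), with illustrative numbers.
p_d = 0.01          # prior: P(disease)
sens = 0.95         # sensitivity: P(test + | disease)
spec = 0.90         # specificity: P(test - | no disease)

p_pos = sens * p_d + (1 - spec) * (1 - p_d)     # law of total probability for P(+)
p_d_given_pos = sens * p_d / p_pos              # Bayes' theorem
print(f"P(disease | +) = {p_d_given_pos:.3f}")  # ~0.088 despite 95% sensitivity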
Week 5 (10/04, 10/06): Naive Bayes classifier based on the Bernoulli distribution, Maximum Likelihood Estimation (MLE), the algorithm, and an example (classification of handwritten digits)
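
A minimal sketch of a Bernoulli naive Bayes classifier fitted by MLE (with add-one smoothing, an assumption on my part) on tiny synthetic binary vectors; the in-class handwritten-digit example would treat binarized pixels the same way.

# Bernoulli naive Bayes: MLE of per-class feature probabilities, then
# classification by maximum posterior log-probability.
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])
    # smoothed MLE of P(x_j = 1 | class c)
    theta = np.array([(X[y == c].sum(0) + alpha) / ((y == c).sum() + 2 * alpha)
                      for c in classes])
    return classes, priors, theta

def predict(X, classes, priors, theta):
    # log P(c) + sum_j log P(x_j | c), evaluated for every class
    log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    return classes[np.argmax(np.log(priors) + log_lik, axis=1)]

X = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]])  # toy binary features
y = np.array([0, 0, 1, 1])
classes, priors, theta = fit_bernoulli_nb(X, y)
print(predict(X, classes, priors, theta))       # recovers [0 0 1 1]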
Week 6 (10/11, 10/13): Naive Bayes classifier based on the Gaussian distribution, Maximum Likelihood Estimation (MLE), the algorithm, and the decision boundary
Week 7 (10/18, 10/20): Confusion matrix, ROC curve, and a discriminative probabilistic model (logistic regression)
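
A minimal sketch computing a confusion matrix and tracing ROC points by sweeping the decision threshold; the labels and scores are made up purely for illustration.

# Confusion matrix and ROC points from thresholded classifier scores.
import numpy as np

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([.9, .8, .7, .6, .5, .4, .3, .2])   # hypothetical scores

def confusion(y_true, y_pred):
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return tp, fp, fn, tn

# Sweeping the threshold traces out points on the ROC curve.
for t in [.85, .65, .45, .25]:
    tp, fp, fn, tn = confusion(y_true, (scores >= t).astype(int))
    tpr, fpr = tp / (tp + fn), fp / (fp + tn)
    print(f"threshold {t:.2f}: TPR={tpr:.2f}, FPR={fpr:.2f}")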
Week 8 (10/25, 10/27): Logistic regression (sentiment-analysis example), comparison between generative and discriminative models, MLE for learning the parameters
Week 9 (11/01, 11/03): Optimization, gradient descent, an example from logistic regression (in-class coding), stochastic gradient descent, comparison with PLA, and nonlinear classifiers via nonlinear feature transformations
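
Since this week features in-class coding for logistic regression, here is one possible minimal sketch (my assumption of what such code might look like, not the official version): batch gradient descent on the mean negative log-likelihood; replacing the full gradient with the gradient at one random example per step yields SGD.

# Batch gradient descent for logistic regression (MLE) on synthetic data.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
Xb = np.hstack([np.ones((200, 1)), X])            # bias column
y = (Xb @ np.array([0.5, 2.0, -1.0]) + rng.normal(0, .5, 200) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, lr = np.zeros(3), 0.1
for _ in range(500):
    grad = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)  # gradient of the mean NLL
    w -= lr * grad                                # descend along -gradient
print("weights:", w, " accuracy:", np.mean((sigmoid(Xb @ w) > .5) == y))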
Week 10 (11/08, 11/10): Neural networks: the abstract neuron; the AND, OR, and XOR problems; the multi-layer perceptron (MLP) and its mathematical definition; revisiting XOR via Boolean operations
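
A minimal sketch of the Boolean view of XOR: a two-layer network of hard-threshold neurons with hand-chosen (not learned) weights computes XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)), even though no single linear unit can.

# XOR via a two-layer network of threshold neurons with fixed weights.
import numpy as np

step = lambda z: (z >= 0).astype(int)            # hard-threshold activation

def xor(x):
    h1 = step(x @ np.array([1, 1]) - 0.5)        # hidden unit 1: OR
    h2 = step(x @ np.array([-1, -1]) + 1.5)      # hidden unit 2: NAND
    return step(h1 + h2 - 1.5)                   # output unit: AND of h1, h2

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(xor(X))                                    # -> [0 1 1 0]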
Week 11 (11/15, 11/17): Neural networks: why direct computation of the gradient of the loss function with respect to each weight is inefficient; introduction and derivation of the backpropagation algorithm
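
A minimal sketch of backpropagation for a one-hidden-layer sigmoid MLP with squared loss, trained on XOR (the architecture, learning rate, and iteration count are illustrative choices): gradients are propagated backward layer by layer via the chain rule, rather than differentiating the loss with respect to each weight directly.

# Backpropagation for a 2-4-1 sigmoid MLP on the XOR data.
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)
sig = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    # forward pass
    h = sig(X @ W1 + b1)
    out = sig(h @ W2 + b2)
    # backward pass: chain rule, output layer first
    d_out = (out - y) * out * (1 - out)          # dLoss/d(output pre-activation)
    d_h = (d_out @ W2.T) * h * (1 - h)           # propagate to the hidden layer
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(0)

print(out.round(2).ravel())                      # typically approaches [0 1 1 0]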
Week 12 (11/22, 11/24): Convolutional Neural Networks (CNNs): convolution of 1-D and 2-D signals, cross-correlation, convolution layers vs. fully connected layers, and the characteristics of CNNs: sparse connectivity, weight sharing, and the receptive field
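
A minimal sketch contrasting 1-D convolution with cross-correlation on a toy signal: convolution flips the kernel, while cross-correlation (what CNN layers actually compute) uses it as-is.

# Convolution vs. cross-correlation of a 1-D signal with a small kernel.
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
k = np.array([1., 0., -1.])

conv  = np.convolve(x, k, mode='valid')          # kernel is flipped
xcorr = np.correlate(x, k, mode='valid')         # kernel is used as-is
print("convolution:      ", conv)                # [ 2.  2.  2.]
print("cross-correlation:", xcorr)               # [-2. -2. -2.]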
Week 13 (11/29, 12/01): Why convolution? An example: the Sobel operator for edge detection; CNN architecture
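
A minimal sketch of the Sobel example: sliding the horizontal-gradient kernel over a small synthetic image (a hypothetical 6x6 array with one vertical edge) shows how a fixed convolution kernel responds only at edges.

# Sobel edge detection by explicit 2-D sliding-window correlation.
import numpy as np

img = np.zeros((6, 6))
img[:, 3:] = 1.0                                 # vertical edge at column 3

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], float)

H, W = img.shape
out = np.zeros((H - 2, W - 2))
for i in range(H - 2):                           # slide the 3x3 window
    for j in range(W - 2):
        out[i, j] = np.sum(img[i:i+3, j:j+3] * sobel_x)
print(out)                                       # nonzero only near the edge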
Week 14 (12/06, 12/08): Pooling, CNN Explainer, and backpropagation in CNNs (derivation)
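
A minimal sketch of 2x2 max pooling with stride 2, the downsampling step visualized in tools like CNN Explainer; the input array is arbitrary.

# 2x2 max pooling with stride 2 via a block reshape.
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))  # maximum within each 2x2 block
print(pooled)                                    # [[ 5.  7.] [13. 15.]]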
Week 15 (12/13, 12/15): Information entropy, Shannon's source coding theorem, the cross-entropy loss, and the Kullback-Leibler divergence
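
A minimal sketch numerically verifying the identity H(p, q) = H(p) + KL(p || q) for two made-up discrete distributions, which is why minimizing cross-entropy with respect to q is equivalent to minimizing KL divergence.

# Entropy, cross-entropy, and KL divergence for discrete distributions.
import numpy as np

p = np.array([0.5, 0.25, 0.25])
q = np.array([0.4, 0.4, 0.2])

H_p   = -np.sum(p * np.log2(p))                  # entropy of p
H_pq  = -np.sum(p * np.log2(q))                  # cross-entropy of p w.r.t. q
KL_pq =  np.sum(p * np.log2(p / q))              # KL divergence from q to p
print(H_p, H_pq, KL_pq)
print(np.isclose(H_pq, H_p + KL_pq))             # -> True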
Week 16 (12/20, 12/22): Final Exam